Glucose analogue inhibitors of glycogen phosphorylase, GP, may be of clinical interest in the
regulation of glycogen metabolism in diabetes. The receptor geometry of glycogen phosphorylase
b, GPb, is available for structure-based design and also for the evaluation of the thermodynamics
of ligand-receptor binding. Free energy force field (FEFF) 3D-QSAR analysis was used to
construct ligand-receptor binding models. FEFF terms involved in binding are represented
by a modified first-generation AMBER force field combined with a hydration shell solvation
model. The FEFF terms are then treated as independent variables in the development of
3D-QSAR models by correlating these energy terms with experimental binding energies for a
training set of inhibitors. The genetic function approximation, employing both multiple linear
regression and partial least squares regression data fitting, was used to develop the FEFF
3D-QSAR models for the binding process and to scale the free energy force field for this
particular ligand-receptor system. The significant FEFF energy terms in the resulting
3D-QSAR models include the intramolecular vacuum energy of the unbound ligand, the
intermolecular ligand-receptor van der Waals interaction energy, and the van der Waals energy
of the bound ligand. Other terms, such as the change in the stretching energy of the receptor
on binding, the change in the solvation energy of the system on binding, and the change in the
solvation energy of the ligand on binding, are also found in the set of significant FEFF
3D-QSAR models. Overall, the binding of this class of ligands to GPb is largely characterized by
how well the ligand can sterically fit into the active site of the enzyme. The FEFF 3D-QSAR
models can be used to estimate the binding free energy of any new analogue in the substituted
glucose series prior to synthesis and testing.



Introduction 

Structure-based design currently involves the use of 
the structure of the receptor and/or, if available, the 
structure of a ligand-receptor complex to perform 
molecular modeling studies to elucidate the features of 
ligand-receptor binding. The three-dimensional structures
of receptors and, increasingly, of ligand-receptor
complexes are available through advances in molecular 
biology, protein expression and purification, X-ray crys- 
tallography, and nuclear magnetic resonance (NMR) 
spectroscopy. Unfortunately, the accurate and reliable 
prediction of the thermodynamics of ligand-receptor 
binding has remained problematic. That is, the compu- 
tational equivalent of the in vitro binding assay has 
remained an elusive goal. 

The correlation between binding affinities and a set
of two- (2D) and three-dimensional (3D) descriptors for
a series of ligands has been used to develop quantitative
structure-activity relationships (QSARs). 1 For those
cases in which the receptor geometry is known, the
physicochemical properties derived from ligand-receptor
interactions can potentially be used in building
receptor-dependent (RD) 3D-QSARs. Hopfinger and co-workers
in 1981 reported the correlation of biological
activities with log P, the water-octanol partition coefficient
of the ligand, and the calculated intermolecular
binding affinities for anticancer anthracyclines inter- 
calating between adjacent DNA base pairs as a 3D- 
QSAR. 2 Holloway et al. have constructed a regression 
equation which relates calculated interaction energies 
for HIV-1 protease inhibitor complexes and the corre- 
sponding observed in vitro enzyme inhibition. 3 Ortiz et 
al. reported a method termed comparative binding 
energy (COMBINE) analysis for constructing RD 3D- 
QSARs in which the receptor geometry is used in 
computing the candidate QSAR descriptors. 4 A recent 
paper by Tokarski and Hopfinger reported an RD 3D-QSAR
methodology termed free energy force field
(FEFF) 3D-QSAR analysis as applied to a set of peptido- 
mimetic renin inhibitors. 5 Succinctly, the principal 
features of FEFF 3D-QSAR analysis are (1) all of the 
enthalpy and entropy contributions to the ligand- 
receptor interaction in a solvent medium are taken into 
consideration, (2) the set of enthalpy and entropy 
contributions to binding are treated as the independent 
variables in developing a 3D-QSAR model for ligand- 
receptor binding, and (3) the optimal FEFF 3D-QSAR
model is constructed using a genetic algorithm. FEFF
3D-QSAR analysis is the intermolecular extension of
molecular shape analysis (MSA) 6 and can be viewed as
a simulation of an in vitro binding assay.

Table 1. Breakdown of the FEFF Interaction Terms, XY, for a Ligand (L) and a Receptor (R) in a Medium (M)

| binding component(s) | type of interaction energy, XY | change in internal energy, symbols | change in entropy, symbols |
| --- | --- | --- | --- |
| ligand L | intramolecular ligand conformational energy LL | $\Delta E_{L}(LL) = E_{LR}(LL) - E_{L}(LL)$ | $\Delta S_{L}(LL) = S_{LR}(LL) - S_{L}(LL)$ |
| ligand L | ligand solvation energy LM | $\Delta E_{L}(LM) = E_{LR}(LM) - E_{L}(LM)$ | $\Delta S_{L}(LM) = S_{LR}(LM) - S_{L}(LM)$ |
| solvent medium M | solvent reorganizational energy MM | $\Delta E_{M}(MM) = E_{LR}(MM) - [E_{L}(MM) + E_{R}(MM)]$ | $\Delta S_{M}(MM) = S_{LR}(MM) - [S_{L}(MM) + S_{R}(MM)]$ |
| receptor R | intramolecular receptor conformational energy RR | $\Delta E_{R}(RR) = E_{LR}(RR) - E_{R}(RR)$ | $\Delta S_{R}(RR) = S_{LR}(RR) - S_{R}(RR)$ |
| receptor R | receptor solvation energy RM | $\Delta E_{R}(RM) = E_{LR}(RM) - E_{R}(RM)$ | $\Delta S_{R}(RM) = S_{LR}(RM) - S_{R}(RM)$ |
| ligand-receptor RL | intermolecular ligand-receptor energy LR | $\Delta E_{LR}(LR) = E_{LR}(LR)$ | $\Delta S_{LR}(LR) = S_{LR}(LR)$ |

Ligand-receptor molecular dynamics simulations 
(MDSs) are used in the FEFF method to assemble the 
ligand-receptor binding states. The smaller the chemical
system (the fewer atoms), the more computationally
economical is the corresponding MDS. One of the goals
of the work reported in this paper is to devise a 
procedure to reduce the size of the receptor model to 
facilitate shorter MDSs while retaining reliable results. 

This paper reports the FEFF 3D-QSAR analysis of a 
set of glucose analogue inhibitors of glycogen phosphorylase
b (GPb). Glycogen is the carbohydrate reserve of
most metabolically active cells in mammals. GP catalyzes
the first step in the phosphorolysis of glycogen to
glucose-1-phosphate. In muscle, glycolysis of glucose-1-phosphate
provides energy to sustain muscle contraction.
The liver converts glucose to provide fuel for other
tissues. GP exists in two interconvertible states through
reversible phosphorylation, the inactive b form (predominantly
T state) and the active a form (predominantly
R state). Hepatic glycogen metabolism is regulated
by glucose through promotion of inactivation of
GPa. 7 Glucose inactivates GPa by competitive inhibition
of glucose-1-phosphate and stabilizes the inactive T
state.

Methods 

1. The FEFF 3D-QSAR Formalism. The ligand- 
receptor interaction can be expressed as 



$$\underset{\text{(unbound state)}}{L\cdot M + R\cdot M} \;\overset{K}{\rightleftharpoons}\; \underset{\text{(bound state)}}{(LR)\cdot M} \tag{1}$$



where L is the ligand, R is the receptor, M is the solvent 
medium, and K is the inhibition constant expressed 
relative to LR dissociation. The difference in free energy 
between the bound and unbound states of a ligand, L, 
to a receptor, R, in a solvent medium, M, can be stated 
as 

$$\Delta G^{\circ} = G^{\circ}_{LR} - (G^{\circ}_{L} + G^{\circ}_{R}) = -RT \ln K \tag{2}$$

where $\Delta G^{\circ}$ is the standard binding free energy, $G^{\circ}_{LR}$ is
the free energy of the bound or complex state, $G^{\circ}_{L}$ is
the free energy of the unbound ligand, $G^{\circ}_{R}$ is the free
energy of the unbound receptor, R is the gas constant,
and T is the temperature of the system. The free energy
of an enzyme-ligand complex can be approximately



broken down into a set of component interactions as 
follows 

$$G^{\circ}_{LR} = [G_{LR}(LL) + G_{LR}(RR) + G_{LR}(MM) + G_{LR}(LR) + G_{LR}(LM) + G_{LR}(RM)] \tag{3}$$

where $G_{LR}(XY)$ refers to the interaction between X and
Y.

The interaction terms can be divided into their
respective enthalpy, $H^{\circ}_{LR}$, and entropy, $S^{\circ}_{LR}$, contributions.

$$G^{\circ}_{LR} = H^{\circ}_{LR} - T S^{\circ}_{LR} \tag{4}$$



At low solute concentration the enthalpy terms, $H_{LR}(XY)$,
can be represented by their respective internal
energies, $E_{LR}(XY)$,

$$H^{\circ}_{LR} = E_{LR} = [E_{LR}(LL) + E_{LR}(RR) + E_{LR}(MM) + E_{LR}(LR) + E_{LR}(LM) + E_{LR}(RM)] \tag{5}$$

and the entropy term, $S_{LR}(XY)$, contributions as

$$S_{LR} = [S_{LR}(LL) + S_{LR}(RR) + S_{LR}(MM) + S_{LR}(LR) + S_{LR}(LM) + S_{LR}(RM)] \tag{6}$$

The unbound ligand, $G^{\circ}_{L}$, and receptor, $G^{\circ}_{R}$, free energies
have the following components

$$G^{\circ}_{L} = [G_{L}(LL) + G_{L}(LM) + G_{L}(MM)] \tag{7}$$

$$G^{\circ}_{R} = [G_{R}(RR) + G_{R}(RM) + G_{R}(MM)] \tag{8}$$

The enthalpy contributions of L and R at low concentration,
$H_{L}(XY)$ and $H_{R}(XY)$, can also be represented by
their internal energies, $E_{L}(XY)$ and $E_{R}(XY)$, as in eq 5.
The complete set of contributions to the internal energy
and entropy and their representations is given in
Table 1.

The terms in Table 1 can be the independent variables
used in the FEFF 3D-QSAR analysis. However, the free
energy of binding, $\Delta G$, can also be represented by the
individual free energy force field terms for L, R, and
LR in Table 1 along with their respective weighting
(regression) coefficients, $\alpha_i$. This representation for $\Delta G$,
eq 9, can be used to provide additional descriptors for
FEFF 3D-QSAR analysis

$$\Delta G = \alpha_1 \Delta E_{\text{stretch}} + \alpha_2 \Delta E_{\text{bend}} + \alpha_3 \Delta E_{\text{torsion}} + \alpha_4 \Delta E_{\text{vdW}} + \alpha_5 \Delta E_{\text{electrostatic}} + \alpha_6 \Delta E_{\text{hydrogen-bonding}} + \alpha_7 \Delta E_{\text{solvation}} + \alpha_8 T\Delta S \tag{9}$$

where $\Delta E_{\text{stretch}}$ is the unbound to bound change in
internal energy for bond stretching, $\Delta E_{\text{bend}}$ is the change
in bond angle bending energy, $\Delta E_{\text{torsion}}$ is the change in
torsional energy, $\Delta E_{\text{vdW}}$ is the change in van der Waals
interaction energy, $\Delta E_{\text{electrostatic}}$ is the change in electrostatic
interaction energy, $\Delta E_{\text{hydrogen-bonding}}$ is the
change in hydrogen-bonding energy, $\Delta E_{\text{solvation}}$ is the
change in solvation energy, and $\Delta S$ is the change in the
entropy of the L, R, M system which can be partitioned
into component contributions. The hydration shell model
proposed by Hopfinger 8 was included in the potential
energy function to calculate the solvation energies.

Koehler and Hopfinger have applied a group additive
property (GAP) method to calculate the conformational
entropies of linear chain polymers. 9 The assumption of
the GAP concept is that some intrinsic contribution to
any composite property, P(i), is associated with the ith
structural group of the molecule. The composite property,
P, is simply taken to be the sum of the P(i)
composing the molecule. The GAP method of Hopfinger
and Koehler is called torsion angle unit, TAU, theory.
A TAU is defined by adjacent structural units connected
by a bond about which some torsion angle θ occurs.

The application of the TAU theory to ligand-receptor
systems was reported by Tokarski and Hopfinger. 5 This
method assumes torsional conformational entropy makes
the largest entropic contribution to the ligand-receptor
system and estimates the component values. The TAU
values 9 are selected for the torsion angle types found 
in the receptor and inhibitors, and the sum of the 
appropriate TAU entropy values allows an estimation 
of the corresponding ligand and receptor conformational 
entropies. 

In this study, the glucose analogue inhibitors have 
quite limited conformational flexibility. Still, the TAU 
method was applied to a subset of the ligands in the 
training set and ligand conformational entropy esti- 
mated. Variation in conformational entropy was found, 
as expected, to be small, and the TAU entropies did not 
appear in the optimized FEFF 3D-QSAR models for this 
subset. It was, therefore, assumed that the ligand 
contribution from a change in torsional conformational 
entropy upon binding to the receptor is small and nearly 
constant across the analogue series. Further, the change 
in the torsional conformational entropy of the receptor 
upon ligand binding was assumed to be constant across 
the analogue series. No attempt was made to estimate 
entropy from the ensemble of states sampled in the 
MDS by constructing the corresponding partition func- 
tion. 

The internal energy change upon ligand-receptor
binding is given by

$$\Delta E_{X} = E_{LR,X} - (E_{L,X} + E_{R,X}) \tag{10}$$

where X represents each of the internal energy contri- 
butions as defined in eq 9. The potential function 
parameters used to calculate the nonbonded, electro- 
static, torsional, bond stretching, and bond angle bend- 
ing energy terms of eq 9 were taken from the AMBER 
force field. 10 Missing force field parameters (torsional, 
bond stretching, and bond angle bending) were scaled 
from a set proposed by Hopfinger 11 and the MM2 force 
field. 12 The set of atoms most similar to those of the 
missing AMBER parameter is identified for a parameter 
which has both AMBER and MM2 (or Hopfinger) values. 



Table 2. Structure-Activity Data for the Glucose Analogue
Inhibitors of Glycogen Phosphorylase b Used in the FEFF
3D-QSAR Training Set

(Structure diagram: glucose ring bearing the substituents α and β.)

| compd | α | β | Ki (mM) | ΔG303 (kcal/mol) |
| --- | --- | --- | --- | --- |
| 1 | H | NHC(=O)CH3 | 0.032 | 6.23 |
| 2 | [illegible] | [illegible] | 0.039 | 6.11 |
| 3 | H | NHC(=O)CH2Br | 0.044 | 6.04 |
| 4 | H | NHC(=O)CH2Cl | 0.045 | 6.03 |
| 5 | H | NHC(=O)C6H5 | 0.081 | 5.67 |
| 6 | H | NHC(=O)CH2CH2CH3 | 0.094 | 5.58 |
| 7 | H | NHC(=O)NH2 | 0.14 | 5.34 |
| 8 | H | C(=O)NHCH3 | 0.16 | 5.26 |
| 9 | H | NHC(=O)CH2NH2 | 0.37 | 4.76 |
| 10 | C(=O)NH2 | H | 0.37 | 4.76 |
| 11 | H | C(=O)NH2 | 0.44 | 4.65 |
| 12 | H | C(=O)NHNH2 | 0.40 | 4.17 |
| 13 | H | SH | 1.00 | 4.16 |
| 14 | CH2OH | H | 1.50 | 3.92 |
| 15 | OH | H | 1.70 | 3.84 |
| 16 | H | C(=O)NHC6H5 | 5.40 | 3.14 |
| 17 | H | OH | 7.40 | 2.95 |
| 18 | H | CH2CN | 9.00 | 2.84 |
| 19 | OH | CH2OH | 15.80 | 2.50 |
| 20 | H | OCH3 | 24.70 | 2.23 |
| 21 | CH2NH2 | H | 34.50 | 2.03 |
| 22 | C(=O)NHCH3 | H | 36.70 | 1.99 |
| 23 | CH3 | H | 53.10 | 1.77 |


The ratio of the known parameter from the AMBER and 
MM2 force fields is determined. The unknown AMBER 
parameter value is then scaled by the same ratio against 
the known MM2 value. This linear scaling approxima- 
tion in force field parametrization is further compen- 
sated by the subsequent force field fitting process which 
is central to the FEFF methodology and has been 
described above. 
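
The ratio scaling used for missing AMBER parameters can be illustrated with a short sketch. The function name and the numeric force constants below are hypothetical and chosen only for illustration; the scaling rule itself follows the description above.

```python
def scale_missing_parameter(amber_known: float, mm2_known: float,
                            mm2_target: float) -> float:
    """Estimate a missing AMBER parameter from its MM2 (or Hopfinger) value.

    amber_known / mm2_known is the ratio for the most similar atom set that
    has values in both force fields. The same ratio is applied to the MM2
    value of the parameter that AMBER lacks.
    """
    if mm2_known == 0:
        raise ValueError("reference MM2 value must be nonzero")
    ratio = amber_known / mm2_known
    return ratio * mm2_target


# Hypothetical stretching force constants (illustrative numbers only).
estimate = scale_missing_parameter(amber_known=310.0, mm2_known=620.0,
                                   mm2_target=700.0)
print(estimate)
```

Any error introduced by this linear approximation is expected to be absorbed in part by the regression coefficients fitted later in the FEFF procedure.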

2. Inhibitory Binding Constants: Dependent
Variables. A training set of glucose analogue inhibitors
of GPb was reported along with their inhibitory binding
constants ($K_i$). 13-16 The kinetic binding studies were
performed as described by Martin and co-workers. 13 The
structure-activity data for the training set of glucose
analogue inhibitors of GPb are given in Table 2. The $K_i$
values are expressed in millimolar (mM) units. The $\Delta G$
values are calculated from the $K_i$ values using eq 2.
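
As a check on the dependent variables, the conversion from an inhibition constant to a binding free energy via eq 2 can be sketched as follows. This is a minimal sketch that assumes the mM values are converted to mol/L, that R = 1.987 cal/(mol K), and that T = 303 K (the assay temperature implied by the ΔG303 column heading); the function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol K); assumed value


def binding_free_energy(ki_mM: float, temperature_K: float = 303.0) -> float:
    """Return -RT ln K (eq 2) in kcal/mol, with K_i converted from mM to M."""
    ki_molar = ki_mM * 1.0e-3
    return -R_KCAL * temperature_K * math.log(ki_molar)


# K_i values (mM) for compounds 1, 13, and 23 as listed in Table 2.
for compd, ki in [(1, 0.032), (13, 1.00), (23, 53.10)]:
    print(compd, round(binding_free_energy(ki), 2))
```

Under these assumptions the sketch returns 6.23, 4.16, and 1.77 kcal/mol for compounds 1, 13, and 23, matching the tabulated values, so larger ΔG in this sign convention corresponds to tighter binding.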

3. Receptor Geometry. Martin and co-workers
cocrystallized glucose bound to GPb at 2.4 Å resolution. 13
The coordinates of this complex were obtained
from the Brookhaven Protein Data Bank 18 under the
PDB entry 2GPB. The starting protein structure used
in the FEFF 3D-QSAR analysis was the refined structure
of the glucose complex. 13 Hydrogens were added
to the PDB structure through an option in the QUANTA
modeling package. 19 Water molecules located in the
crystal structure of the glucose-GPb complex were not
included in the FEFF protein model. Steric contact
violations were identified by 3000 steps of MDS using
the MOLSIM package. 20 The bad steric interactions
were relieved by perturbation conformational scans over
the side chains of the residues causing them. AMBER
partial atomic charges 10 were assigned to all atoms of
the enzyme structure. The potentially ionizable residues
of the enzyme were modeled as neutral.

Figure 1. Schematic representation of the geometry used to
determine the size of the model enzyme. The amino acid
residues centered around the ligand (thick line) are included
in the 8 Å model size (inner concentric circle). The outer circle
represents the 10 Å model.


4. Building and Docking the Ligands. The geom- 
etry of bound glucose was used as the starting structure 
for building the ligand analogues of Table 2. The 
Chemlab-II molecular modeling program 21 was used to 
add substituents to the glucose ring. Substituent ge- 
ometries were optimized by fixed valence geometry 
conformational analysis. CNDO/2 charges were assigned 
to the ligands. The conformation and alignment of the 
reference crystal glucose ring structure was used to dock 
the glucose analogues. Bad steric contacts between a 
few of the ligand analogues and side chains of enzyme 
residues were relieved during initial docking. 

5. Enzyme Model Size Determination. The complete
enzyme model contains 13470 atoms, including
protons. This large enzyme size would require
unrealistically extensive MDSs to yield meaningful
results. Thus, the GPb enzyme system was scaled down
to economize the MDSs. The largest inhibitor (compound
16 of Table 2) was docked at the active site to
determine the minimum size of an effective enzyme
model needed to do reliable FEFF 3D-QSAR analysis.
The derivation of the enzyme binding model was done
using the "pruning" method of Tokarski and Hopfinger. 5
Spherical enzyme models of 12, 10, and 8 Å radii,
centered around the docked ligand, see Figure 1, were
examined in order to prune the enzyme without losing
binding information. The pruned enzyme models essentially
consist of amino acid residues clustered around
the active site. Localized conformational changes of the
residues at the active site were seen for the binding of
some analogues. 13-16 The pruned models are designed
to allow for these conformational changes at and near
the active site.

The residues that had at least one non-hydrogen atom
within the pruning sphere were included in the corresponding
enzyme model. The pruning process usually
leads to an enzyme model consisting of a number of
nonbonded (unconnected) peptide residue clusters. Peptide
residue clusters separated by less than five intervening
amino acid residues were "connected" by including
the intervening amino acid residues. This approach
is intended to retain local geometric integrity of the
enzyme model in the pruning process.
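
The pruning rule described above can be sketched in a few lines. This is an illustrative sketch only, not the published implementation: it assumes a single chain with consecutive integer residue indices, it assumes the sphere is centered on the ligand centroid, and the function name and toy coordinates are hypothetical.

```python
import math


def prune_residues(residue_atoms, ligand_atoms, radius, max_gap=4):
    """Select residue indices for a pruned enzyme model.

    residue_atoms: {residue_index: [(x, y, z), ...]} non-hydrogen atoms
    ligand_atoms:  [(x, y, z), ...] coordinates of the docked ligand
    radius:        pruning radius in angstroms
    max_gap:       longest run of intervening residues that is filled in
                   ("less than five" in the text, hence 4)
    """
    n = len(ligand_atoms)
    # Assumption: the sphere center is the ligand centroid.
    center = tuple(sum(a[i] for a in ligand_atoms) / n for i in range(3))

    # Keep residues with at least one non-hydrogen atom inside the sphere.
    selected = sorted(
        idx for idx, atoms in residue_atoms.items()
        if any(math.dist(atom, center) <= radius for atom in atoms)
    )

    # Connect clusters separated by a short run of intervening residues.
    kept = set(selected)
    for left, right in zip(selected, selected[1:]):
        gap = right - left - 1
        if 0 < gap <= max_gap:
            kept.update(i for i in range(left + 1, right) if i in residue_atoms)
    return sorted(kept)


# Toy system: one atom per residue, ligand at the origin, 8 angstrom sphere.
toy = {1: [(1, 0, 0)], 2: [(20, 0, 0)], 3: [(21, 0, 0)],
       4: [(2, 0, 0)], 10: [(3, 0, 0)], 11: [(30, 0, 0)]}
model = prune_residues(toy, [(0, 0, 0)], radius=8.0)
print(model)
```

In the toy system, residues 2 and 3 lie outside the sphere but are retained because they bridge residues 1 and 4, whereas the five-residue gap between residues 4 and 10 is left unconnected.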

The enzyme model size was evaluated for both con- 
formational and binding integrity by performing MDSs 
on the scaled down trial models. An MDS of 2 ps at 300
K was performed on each of the model enzymes of
different pair-interaction size using a nonbonded cutoff
of 16 Å. The molecular dielectric was 3.5. The modified
AMBER force field was used in the MDSs. Comparisons 
of the total intermolecular van der Waals and electro- 
static energies, respectively, for each of the three 
different size enzyme models were made. 

A root-mean-square (RMS) fit of the lowest energy 
structure of each of the three model enzymes to the 
starting crystal structure was made. The RMS fit of the 
whole enzyme structure from the MDS with respect to 
the crystal structure was also made. The ligand-receptor
interaction energies were calculated for each
of the enzyme models. 

Some atoms of each enzyme model had to be con- 
strained to prevent major deviations from the starting 
crystal structure. The enzyme models contain uncon- 
nected peptides, but assigning fictitious high masses to 
all of the atoms in an enzyme model provides "momen- 
tum reservoirs" which can impart equivalent structural 
and dynamic effects as the "absent" amino acid residues 
of the complete enzyme. Different enzyme models were 
explored by assigning a fictitious mass of 1000, 2500, 
and 5000, respectively, to each atom of an enzyme model 
and then evaluating the RMS fit to the initial X-ray 
crystal structure as described by Tokarski and Hopfin- 
ger. 5 Inhibitors 1, 16, and 23 of Table 2 were used in 
this enzyme pruning and evaluation study. An MDS of
10 ps, with a step size of 0.5 fs, was performed for each 
of the trial enzyme mass models with a bound ligand 
at 300 K. The maintenance of the structural integrity 
of the complex, as measured by the RMS fit, was used 
to evaluate and select the preferred fictitious mass 
value. It should be pointed out that the conformational 
entropy terms are estimated using the TAU method 
which is independent of the masses assigned in this 
enzyme pruning scheme. 

6. Sampling Temperature Determination. The
temperature of an MDS may only approximate the
corresponding real temperature of the chemical system.
The specific representation of the force field used in an
MDS can also influence the relationship between actual
and simulation temperatures. Thus, temperature was
considered as a scaling variable in this FEFF 3D-QSAR
analysis.

The temperature sampling scheme developed
by Tokarski and Hopfinger 5 for the bound and
unbound states of the ligand-enzyme system, using a
subset of the analogue training set, see Table 3, was
carried out at 400, 350, 300, 200, 100, 50, 25, 10, and 5
K. "Mini" FEFF 3D-QSAR models derived from the
ligand subset were used to determine the preferred
simulation temperature. That is, the preferred MDS
temperature corresponds to the best fit of FEFF thermodynamic
parameters to the experimental binding free
energies. The correlation coefficient, r², and leave-one-out
cross validation coefficient, xv-r², of the best FEFF
3D-QSAR for each simulation temperature are given in
Table 4.

Table 3. Subset of the Glucose Analogue Inhibitor Training
Set Used in the Determination of the Optimum MDS Sampling
Temperature

| compd | α | β | Ki (mM) | ΔG303 (kcal/mol) |
| --- | --- | --- | --- | --- |
| 1 | H | NHC(=O)CH3 | 0.032 | 6.23 |
| 4 | H | NHC(=O)CH2Cl | 0.045 | 6.03 |
| 5 | H | NHC(=O)C6H5 | 0.081 | 5.67 |
| 8 | H | C(=O)NHCH3 | 0.16 | 5.26 |
| 15 | OH | H | 1.70 | 3.84 |
| 16 | H | C(=O)NHC6H5 | 5.40 | 3.14 |
| 21 | CH2NH2 | H | 34.50 | 2.03 |
| 22 | C(=O)NHCH3 | H | 36.70 | 1.99 |
| 23 | CH3 | H | 53.10 | 1.77 |

Table 4. r² and xv-r² Values of the Best FEFF 3D-QSARs
Realized from the MDSs at Different Simulation Temperatures

| temp (K) | r² | xv-r² |
| --- | --- | --- |
| 400 | 0.56 | 0.38 |
| 350 | 0.62 | 0.49 |
| 300 | 0.85 | 0.79 |
| 200 | 0.70 | 0.56 |
| 100 | 0.45 | 0.30 |
| 50 | 0.54 | 0.33 |
| 25 | 0.55 | 0.45 |
| 10 | 0.33 | 0.23 |
| 5 | 0.50 | 0.29 |
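
The two statistics used to rank the simulation temperatures, r² and the leave-one-out xv-r², can be sketched as below. This is a simplified illustration that fits a single descriptor by ordinary least squares and takes xv-r² as 1 - PRESS/SS_total; the exact definition used by the authors' software is an assumption here, and the descriptor values are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for one descriptor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def r_squared(xs, ys):
    """Conventional r^2 of the fit to the full training set."""
    slope, intercept = fit_line(xs, ys)
    my = sum(ys) / len(ys)
    ss_res = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def loo_r_squared(xs, ys):
    """Leave-one-out cross-validated r^2: refit without each compound in
    turn, predict the omitted compound, and accumulate the squared errors."""
    my = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        slope, intercept = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values paired with hypothetical activities.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]
print(r_squared(descriptor, activity), loo_r_squared(descriptor, activity))
```

For a least-squares fit the cross-validated value cannot exceed the fitted r², which is consistent with the pattern in Table 4 where xv-r² is lower than r² at every temperature.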

7. Computational Details. The binding simulation 
sampling scheme was initiated by an MDS of 20 ps, 
using a time step of 0.5 fs, on the ligand-enzyme 
complex model. The structures of the models and their 
corresponding FEFF energy terms were recorded every 
0.2 ps of the simulation. Modeling of the unbound state 
consisted of isolating both the bound ligand and the 
corresponding receptor from the lowest energy geometry 
of their complex realized from the bound state MDS 
modeling. An MDS of the unbound ligand was then
performed at 300 K for 100 ps using a time step of 1 fs.
A corresponding MDS of the unbound receptor was
performed for a sampling time of 20 ps, with a step size
of 0.5 fs, at 300 K. The hydration shell model 8 was used
to calculate the solvation energy of the lowest energy
conformation obtained from the MDS of the complex,
the unbound ligand, and the unbound receptor. The
lowest energy geometries of the complex, $E_{LR}$, unbound
ligand, $E_{L}$, and unbound receptor, $E_{R}$, were used,
respectively, to obtain the corresponding free energy
force field terms as described in Table 1 and eq 9. The
ensemble averaged free energy force field terms were
also calculated using the trajectories of MDSs as the
ensembles.

8. Construction of the FEFF 3D-QSAR Models.
The nonscaled FEFF energy terms were used as descriptors
(independent variables), and the genetic function
approximation (GFA) optimization method, 22 employing
multiple linear regression, MLR, was used to
construct trial QSAR models. The robustness of each
model was tested by evaluating statistical measures of
fit which included r², xv-r², the F-statistic, F, and the
lack-of-fit, LOF. 22,23 The number of possible FEFF
descriptors is large compared to the number of analogues
in the training set. The LOF measure in the GFA
prevents overfitting of data by assigning a penalty for
the addition of independent variables to a model.

To determine if the top five 3D-QSAR models provide
common, or distinct, structure-activity information, the
correlation coefficients of the residuals in the error
(observed activity - predicted activity) between pairs
of models were computed. Equivalent models are expected
to have highly correlated distributions in the residuals
of error. Distinct models should have noncorrelated
patterns in their residuals of fit (error). This type of
analysis has been suggested by Rogers 24,25 as a diagnostic
to determine the subset of distinct models among
a set of good models realized in a GFA analysis.
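
The residual-correlation diagnostic can be sketched as follows. This is a minimal illustration using the Pearson correlation coefficient; the model names, observed activities, and predictions are hypothetical and are not taken from the paper.

```python
import math
from itertools import combinations


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / math.sqrt(var_u * var_v)


def residual_correlations(observed, predictions):
    """Correlate residuals (observed - predicted) for every pair of models.

    predictions: {model_name: [predicted activity per compound]}
    """
    residuals = {
        name: [o - p for o, p in zip(observed, pred)]
        for name, pred in predictions.items()
    }
    return {
        (a, b): pearson(residuals[a], residuals[b])
        for a, b in combinations(residuals, 2)
    }


# Hypothetical data: models A and B err in the same way, model C in the
# opposite direction for every compound.
observed = [6.2, 5.3, 4.2, 3.1, 2.0]
predictions = {
    "A": [6.0, 5.5, 4.0, 3.3, 2.1],
    "B": [6.0, 5.5, 4.0, 3.3, 2.1],
    "C": [6.4, 5.1, 4.4, 2.9, 1.9],
}
corr = residual_correlations(observed, predictions)
print(corr)
```

A coefficient near 1 for a pair of models suggests that the two carry essentially the same structure-activity information, while a coefficient near 0 suggests that they are distinct.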

Results 

The size of the enzyme model was selected based on
a combination of geometric stability and the variance
of the ligand-enzyme interaction energy for the 12, 10,
and 8 Å model radii. The ligand-model enzyme interaction
energies showed no significant differences among
the three model sizes evaluated for an atom-pair nonbonded
cutoff of 16 Å. The atom RMS fit of each lowest
energy geometry from the corresponding MDS for the
12, 10, and 8 Å radii models to the enzyme crystal
structure was 1.06, 1.04, and 1.29 Å, respectively. The
total ligand-enzyme interaction energies for the 12, 10,
and 8 Å models are 44.57, 45.01, and 44.26 kcal/mol,
respectively. The RMS fit of the whole enzyme structure
from the MDS with respect to the crystal structure was
1.30 Å. The enzyme model size of 10 Å was selected for
all FEFF final MDSs as a compromise between structural
integrity and computational efficiency.

A fictitious mass assignment to all non-hydrogen 
atoms of the enzyme model was found to be a minimum 
necessary condition to maintain the geometric integrity 
of the enzyme model. Moreover, a fictitious mass of 2500
assigned to each of the atoms in the 10 Å enzyme model
provided optimal geometric model stability (Figure 2).
In addition, studies on the role of the MDS schedule
indicated that longer simulations did not necessarily
lead to lower energy structures. Instead, MDSs of more
than 20 ps often caused the enzyme model to diverge
substantially from the crystal geometry. It was found
that a simulation temperature of 300 K provided a trial 
set of FEFF terms which when used as descriptors 
yielded the best FEFF 3D-QSAR model. It should be 
noted that the bioassays for measuring binding were 
performed at 303 K. 

Overall, a sampling temperature of 300 K on the 10
Å enzyme model, with a fictitious mass of 2500 assigned
to all enzyme model atoms, and an MDS of 20 ps provided
the best compromise between computational time and
the realization of low-energy states of geometries close
to the crystal structure. The low-energy structures,
obtained from the MDSs, of all the ligand-enzyme
systems of the training set were found to have RMS fits
to the crystal structure of less than 1.5 Å (Figures 3
and 4).

Figure 2. Overlap of the protein backbone of the initial
crystal structure (dark lines) with the low-energy model
enzyme structure after molecular dynamics simulation at 300
K using a heavy mass assignment of 2500 to each atom of the
10 Å model.



The trial FEFF 3D-QSAR models were constructed 
using the GFA-MLR option of the WOLF program 26 for 
a sample size of 23 analogues. The models derived from 
GFA-MLR analysis are listed in Table 5 along with their 
statistical measures of fit and robustness as discussed 
earlier. A smoothing factor of 0.5 was found to optimize 
model size (number of independent variables) and 
predictiveness. 

The top five FEFF 3D-QSAR models based on the
applied statistical measures of fit are reported in Table
5. The FEFF terms found as significant descriptors in
the GFA-MLR analysis are defined in Table 6. All
ligands in the training set (no outliers) are used in
constructing the models in Table 5. The van der Waals
intermolecular ligand-receptor binding energy, $E_{LR,vdW}$,
the intramolecular energy of the unbound ligand, $E_{L}(LL)$,
and the van der Waals energy of bound ligand,
$E_{LR,vdW}(LL)$, are found in all five top models and are not
correlated to other significant FEFF terms found among
the top five FEFF 3D-QSAR models. The stretching
energy of the bound ligand, $E_{LR,stretch}(LL)$, the electrostatic
energy of the bound ligand, $E_{LR,electrostatic}(LL)$, and
the hydrogen-bonding energy of the unbound ligand,
$E_{L,hb}(LL)$, are present interchangeably in the five top
models. The variables $E_{LR,el}$, $\Delta E_{hb}$, and $\Delta E_{electrostatic}$, see Table
6, although not found in the FEFF 3D-QSAR models,
are highly correlated with some of the significant FEFF
terms in the top models, see Table 7. The descriptors
within any given model are not significantly correlated
to one another.

Figure 3. A stereoview of the overlap of the active site amino acid residues of the enzyme crystal structure with the low-energy
representative structure of the model enzyme after molecular dynamics simulation at 300 K with a heavy mass of 2500 assigned
to each atom of the 10 Å model with the α-methyl analogue (compound 23) bound.

Figure 4. A stereoview of the overlap of the active site amino acid residues of the enzyme crystal structure with the low-energy
representative structure of the model enzyme after molecular dynamics simulation at 300 K with a heavy mass of 2500 assigned
to each atom of the 10 Å model with the β-methyl acetyl analogue (compound 1) bound.

Table 5. Top Five FEFF 3D-QSAR Models from the GFA-MLR Analysis and Their Statistical Measures of Fit

| model | r² (a) | xv-r² (b) | F (c) | LOF (d) |
| --- | --- | --- | --- | --- |
| $\Delta G = -0.10E_{L}(LL) - 0.16E_{LR,vdW} + 0.07E_{LR,stretch}(LL) - 0.80E_{LR,vdW}(LL) - 0.31$ | 0.72 | 0.58 | 11.3 | 1.10 |
| $\Delta G = -0.10E_{L}(LL) - 0.17E_{LR,vdW} + 0.06\Delta E_{L,stretch}(LL) - 0.79E_{LR,vdW}(LL) - 0.02$ | 0.71 | | | 1.13 |
| $\Delta G = -0.13E_{L}(LL) - 0.26E_{LR,vdW} + 0.08E_{L,stretch}(LL) - 0.74\Delta E_{L,vdW}(LL) - 0.83$ | 0.71 | 0.53 | | 1.13 |
| $\Delta G = -0.10E_{L}(LL) - 0.19E_{LR,vdW} + 0.05E_{LR,electro}(LL) - 0.74E_{LR,vdW}(LL) - 0.61$ | 0.70 | 0.55 | | 1.14 |
| $\Delta G = -0.10E_{L}(LL) - 0.19E_{LR,vdW} + 0.02E_{L,hb}(LL) - 0.74E_{LR,vdW}(LL) - 0.68$ | 0.70 | 0.53 | 10.6 | 1.16 |

(a) Correlation coefficient. (b) Cross-validated r² (leave-one-out method). 29 (c) F-statistic. (d) LOF is the lack-of-fit measure. 22,23

Table 6. Definitions of the Significant FEFF Terms Found in
the Top FEFF 3D-QSAR Models

| term | definition |
| --- | --- |
| $E_{L}(LL)$ | the intramolecular energy of the unbound ligand |
| $E_{LR,vdW}$ | the van der Waals intermolecular ligand-receptor energy |
| $\Delta E_{L,vdW}(LL)$ | the change in the van der Waals energy of the ligand upon binding |
| $E_{LR,vdW}(LL)$ | the intramolecular van der Waals energy of the bound ligand |
| $E_{L,vdW}(LL)$ | the intramolecular van der Waals energy of the unbound ligand |
| $E_{L,stretch}(LL)$ | the stretching energy of the unbound ligand |
| $E_{LR,stretch}(LL)$ | the stretching energy of the bound ligand |
| $\Delta E_{L,stretch}(LL)$ | the change in the stretching energy of the ligand on binding |
| $\Delta E_{R,stretch}(RR)$ | the change in the stretching energy of the receptor on binding |
| $\Delta E_{R,bend}(RR)$ | the change in the bending energy of the receptor on binding |
| $\Delta E_{stretch}$ | the change in the stretching energy on binding |
| $E_{LR,electro}(LL)$ | the intramolecular electrostatic energy of the bound ligand |
| $E_{L,hb}(LL)$ | the intramolecular hydrogen-bonding energy of the unbound ligand |
| $E_{LR,el}$ | the electrostatic intermolecular ligand-receptor energy |
| $\Delta E_{hb}$ | change in hydrogen-bonding energy of the whole system upon binding |
| $\Delta E_{electrostatic}$ | change in electrostatic energy of the whole system upon binding |
| $E_{L}(LM)$ | the solvation energy of the unbound ligand |
| $\Delta E_{L}(LM)$ | change in solvation energy of the ligand upon binding |
| $\Delta E_{solv}$ | change in solvation energy of the whole system upon binding |

Partial least squares (PLS) regression yielded a three-component
FEFF model as being most significant. All the top MLR models
given in Table 5 contain the three descriptors identified by
PLS, i.e., the van der Waals intermolecular ligand-receptor
binding energy, $E_{LR,vdW}$, the intramolecular energy of the
unbound ligand, $E_{L}(LL)$, and the van der Waals energy of the
bound ligand, $E_{LR,vdW}(LL)$. Thus, the FEFF terms used in the
top models are likely the major thermodynamic properties
governing the relative binding thermodynamics. The relative
contributions of the individual FEFF descriptors during the GFA
model optimization are shown by the crossover versus descriptor
usage plot in Figure 5. Once again the LOF measure prevents
overfitting of the data and is particularly appropriate in the
FEFF 3D-QSAR approach, since many of the energy terms are
derived from the same source and the number of energy terms (the
trial set of independent variables) is large compared to the
number of observations (analogues in the training set). A
statistically poor 3D-QSAR model ($r^2$ = 0.3, xv-$r^2$ = -0.12)
was obtained when the biological activity column was randomized,
suggesting that the FEFF 3D-QSAR models in Table 5 were not due
to random correlations.
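The randomization check described above can be sketched as follows. This is a simplified illustration, not the GFA procedure used in the study: it fits a single hypothetical descriptor by ordinary linear regression, and the data, the function names, and the number of scrambling trials are assumptions made for the example.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation, i.e. r^2 of a one-descriptor linear fit."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def y_randomization(x, y, n_trials=200, seed=0):
    """Refit after shuffling the activity column; return the r^2 of each scrambled fit."""
    rng = random.Random(seed)
    scores = []
    for _ in range(n_trials):
        shuffled = y[:]
        rng.shuffle(shuffled)
        scores.append(r_squared(x, shuffled))
    return scores


# Synthetic training set of 23 "compounds": one descriptor plus Gaussian noise.
data_rng = random.Random(1)
x = [float(i) for i in range(23)]
y = [0.5 * xi + data_rng.gauss(0.0, 1.0) for xi in x]

true_r2 = r_squared(x, y)
scrambled = y_randomization(x, y)
# Fraction of scrambled fits that match or beat the fit to the real activities.
fraction_as_good = sum(s >= true_r2 for s in scrambled) / len(scrambled)
print(f"r2 = {true_r2:.2f}, best scrambled r2 = {max(scrambled):.2f}, "
      f"fraction as good = {fraction_as_good:.3f}")
```

A model whose fit is matched by a sizeable fraction of the scrambled fits would be suspect; a fraction near zero is consistent with a correlation that is not due to chance.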

To determine if the top five 3D-QSAR models are providing
common, or distinct, structure-activity information, the
correlation coefficients of the residuals of error (observed
activity - predicted activity) between pairs of models were
computed and are reported in Table 8. All of the top five models
are highly correlated to one another, indicating there is only
one unique FEFF 3D-QSAR model, which is selected as the model
with the highest xv-$r^2$ value, namely model 1 in Table 5.
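The pairwise comparison of model residuals can be sketched as below. The observed and predicted values are invented for the illustration, and the names `pearson` and `residual_correlations` are hypothetical helpers rather than part of the original analysis.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    suv = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    suu = sum((a - mu) ** 2 for a in u)
    svv = sum((b - mv) ** 2 for b in v)
    return suv / sqrt(suu * svv)


def residual_correlations(observed, predictions):
    """Correlate the residuals (observed - predicted) of every pair of models."""
    residuals = {
        name: [o - p for o, p in zip(observed, pred)]
        for name, pred in predictions.items()
    }
    names = list(residuals)
    return {
        (a, b): pearson(residuals[a], residuals[b])
        for i, a in enumerate(names)
        for b in names[:i]
    }


# Invented activities and predictions for five compounds and two models.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predictions = {
    "model_1": [1.2, 1.7, 3.1, 4.4, 4.6],
    "model_2": [1.1, 1.8, 3.2, 4.3, 4.7],
}
for pair, r in residual_correlations(observed, predictions).items():
    print(pair, round(r, 2))
```

Residual correlations close to 1 indicate that two models err on the same compounds in the same direction and therefore carry essentially the same structure-activity information.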

Outliers were defined as those analogues whose difference in
observed and predicted $\Delta G$ values is greater than 2
standard deviations from the mean. The resulting two outliers
for the training set are compounds 18 and 19 (Table 2), as can
be seen in Figure 6. The removal of these two outliers and
GFA-MLR refitting on the remaining data set of 21 compounds
yielded the FEFF 3D-QSAR models shown in Table 9. In addition to
the dispersion terms found in the original best model of Table
5, the total LR(M) change in the solvation energy upon binding,
$\Delta E_{solv}$, the change in the solvation energy of the
ligand on binding, $\Delta E_{L}(LM)$, and the unbound ligand
solvation energy, $E_{L}(LM)$, are now found as significant
descriptors in the best "outlier-free" four-descriptor FEFF
3D-QSAR models. The stretching, electrostatic, and
hydrogen-bonding energy terms of the models given in Table 5 are
replaced by the change in the stretching energy upon binding,
$\Delta E_{stretch}$, and the change in the stretching energy of
the receptor upon ligand binding, $\Delta E_{R,stretch}(RR)$.
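The outlier criterion can be expressed compactly. The sketch below assumes the sample standard deviation of the residuals is meant (the text does not say which estimator was used) and runs on invented residuals for 23 compounds; `find_outliers` is a hypothetical helper name.

```python
from statistics import mean, stdev


def find_outliers(observed, predicted, n_sd=2.0):
    """Return 1-based compound numbers whose residual lies more than n_sd
    sample standard deviations from the mean residual."""
    residuals = [o - p for o, p in zip(observed, predicted)]
    centre = mean(residuals)
    spread = stdev(residuals)  # assumption: sample (n - 1) standard deviation
    return [
        i + 1
        for i, r in enumerate(residuals)
        if abs(r - centre) > n_sd * spread
    ]


# Invented data: small alternating residuals, with two large ones placed at
# compounds 18 and 19 purely to mirror the situation described in the text.
residuals = [0.2 if i % 2 == 0 else -0.2 for i in range(23)]
residuals[17] = 3.0
residuals[18] = -3.0
predicted = [float(i) for i in range(23)]
observed = [p + r for p, r in zip(predicted, residuals)]

print(find_outliers(observed, predicted))
```

Because the outliers themselves inflate the standard deviation, a 2 SD cutoff is conservative; only residuals that are large relative to the whole set are flagged.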

GFA-MLR optimization of the FEFF 3D-QSAR models was carried out
by developing three-, four-, and five-descriptor families of
models (Table 9). This type of multiple model representation
helps to identify the origin and significance of each term in
the manifold of FEFF 3D-QSAR models. It is possible that the
five-term models constitute an overfit case, but these models
should be considered within the context of the manifold set of
models in Table 9. The three-descriptor models are predominantly
comprised of the dispersion energy terms and the intramolecular
vacuum energy of the unbound ligand. The solvation energy term
is added to the four-descriptor models, and this increases the
$r^2$ value from 0.77 to 0.88. The five-descriptor models do not
provide a better "explanation" (significantly increased data
fits) as compared to the four-descriptor models and are found to
be comprised of the same terms found in the three- and
four-descriptor models.
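Whether an added descriptor improves predictivity rather than only the fit can be judged with the leave-one-out cross-validated $r^2$ reported in the tables. The following is a minimal sketch of that statistic for a multiple linear regression, using one common definition (1 - PRESS / total sum of squares about the full-set mean); the helper names and the data are hypothetical, and the small linear solver stands in for whatever regression code was actually used.

```python
def solve(a, b):
    """Solve the square system a.x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows, y):
    """Least-squares coefficients (intercept first) from the normal equations."""
    design = [[1.0] + list(r) for r in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)


def predict(coef, row):
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))


def loo_xv_r2(rows, y):
    """Leave-one-out cross-validated r^2: each compound is predicted by a
    model fitted to all the other compounds."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        coef = fit_ols(rows[:i] + rows[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, rows[i])) ** 2
    ss_total = sum((v - mean_y) ** 2 for v in y)
    return 1.0 - press / ss_total


# Invented two-descriptor data that follow an exact linear law.
rows = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0], [5.0, 8.0], [6.0, 4.0]]
y = [1.0 + 2.0 * a - b for a, b in rows]
print(round(loo_xv_r2(rows, y), 6))
```

Unlike the fitted $r^2$, this statistic can decrease, or even become negative, when a descriptor is added that only fits noise.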

Compound 18, the β-cyanomethyl analogue, contains a rigid side
chain, and compound 19 is the only compound in the analogue
series, other than glucose, to have an α-hydroxy substituent.
These singular properties may be the reasons why these two
analogues are outliers. The N atom of the cyano group of
compound 18 and the O of the α-hydroxy group of compound 19
Table 7. Linear Cross-Correlation Matrix of the Descriptors in the Top FEFF 3D-QSAR Models a

energy
terms b   1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17

 1      1.00
 2     -0.30   1.00
 3      0.08  -0.31   1.00
 4      0.02  -0.22   0.97   1.00
 5      0.14   0.08  -0.35  -0.45   1.00
 6      0.16  -0.02   0.10   0.01   0.89   1.00
 7      0.06   0.03   0.18   0.19  -0.11  -0.03   1.00
 8      0.13   0.72  -0.31  -0.26   0.09  -0.03   0.13   1.00
 9     -0.04  -0.50   0.38   0.34  -0.16   0.00   0.68  -0.64   1.00
10      0.26   0.10   0.13   0.17   0.02   0.11   0.05   0.09  -0.03   1.00
11     -0.26  -0.10  -0.14  -0.17  -0.02  -0.11  -0.05  -0.09   0.03  -1.00   1.00
12     -0.02  -0.26   0.21   0.13   0.20   0.29   0.13  -0.15   0.21  -0.77   0.77   1.00
13     -0.06  -0.06   0.08   0.05   0.16   0.21  -0.05  -0.20   0.11   0.37  -0.37  -0.24   1.00
14      0.02   0.05  -0.14  -0.09  -0.15  -0.21  -0.02   0.19  -0.16  -0.35   0.35   0.21  -0.99   1.00
15     -0.03  -0.04  -0.06  -0.04  -0.14  -0.18  -0.01   0.07  -0.06  -0.47   0.47   0.34  -0.98   0.98   1.00
16      0.04  -0.03   0.14   0.11   0.00   0.06  -0.09  -0.17   0.06   0.45  -0.45  -0.40   0.92  -0.92  -0.94   1.00
17     -0.05   0.03  -0.16  -0.12  -0.01  -0.07   0.10   0.18  -0.05  -0.45   0.45   0.41  -0.92   0.92   0.93  -1.00   1.00

a The matrix includes descriptors that are correlated to those FEFF descriptors determined from the top five models using the GFA-MLR scheme. b 1 = $E_{L}(LL)$, 2 = $E_{LR,vdW}$, 3 = $E_{LR,stretch}(LL)$, 4 = $\Delta E_{L,stretch}(LL)$, 5 = $\Delta E_{R,stretch}(RR)$, 6 = $\Delta E_{stretch}$, 7 = $E_{LR,vdW}(LL)$, 8 =
Figure 5. A plot of the GFA-MLR optimization for predicting
$\Delta G$. Descriptor usage plotted as a function of number of
crossovers in the GFA-MLR analysis is shown here. Symbols:
O, $E_{L}(LL)$ use; ■, $E_{LR,vdW}(LL)$ use; a, $E_{LR,vdW}$ use;
a, $E_{L,vdW}(LL)$ use; *, $E_{LR,stretch}(LL)$ use; O,
$E_{L,hb}(LL)$ use; $\Delta E_{L,stretch}(LL)$ use; ♦,
$E_{LR,electrostatic}(LL)$ use. The FEFF terms (descriptors)
are defined in Table 6.



Table 8. Linear Correlation Matrix of the Residuals of Error 
for the Top Five FEFF 3D-QSAR Models from GFA-MLR 
Optimization 
were also found to form hydrogen bonds to the active site
residues through a water molecule (water bridges) (ref). The
only source of hydrogen bonding for these substituent groups to
the active site residues seems to be through such water
bridges.¹⁴ Water-bridge ligand-receptor hydrogen bonding also
occurs for compounds 10 and 15, which are found to have
relatively large (within the outlier definition) residual
values. Ligand-receptor hydrogen bonding via water molecule
bridges is not explicitly considered in our FEFF MDSs.
Therefore, poor modeling of water-bridge ligand-receptor
interactions may be another source of outliers.
Figure 6. Plot of the observed (♦) and predicted (O) (using
model 1 of Table 5) binding free energy, $\Delta G$, versus the
compound number (Table 2). The outliers are identified in the
plot.

Discussion 

The ligand-receptor modeling approximations employed in this
FEFF 3D-QSAR study are listed in Table 10. This table also
describes how each approximation influences the calculations,
and suggestions are given as to how to minimize the adverse
impacts of the approximations on the development of FEFF
3D-QSAR models.

Explicit water molecules have not been considered in the MDSs.
Rather, a hydration shell model to estimate solvation energetics
has been included in the force field. The hydration shell model,
when applied over the course of an MDS, can produce major
distortions in molecular geometry. This behavior arises because
the force derived from the hydration shell potential function is
unrealistically large and operates over the entire length of the
MDS time step. To minimize this flaw in the representation of
the force field solvation term, the FEFF solvation energies are
only computed for selected, representative, low-energy
geometries along an MDS trajectory.

The absence of explicit water molecules during the 
MDS eliminates the possibility of forming specific types 
of solute-solvent structures, such as a water bridge 
between the bound ligand and the enzyme as is observed 
in the crystal complexes for some analogues in this data 
set. On the other hand, assignment of explicit waters 
in an MDS requires very long MDSs in order to 
Table 9. Top Five Three-, Four-, and Five-Descriptor FEFF 3D-QSAR Models for Prediction of $\Delta G$ after Deletion of Outliers

3D-QSARs

Three-Descriptor Models

$\Delta G = -0.10E_{L}(LL) - 0.16E_{LR,vdW} - 0.85E_{LR,vdW}(LL) + 0.45$
$\Delta G = -0.70E_{L,vdW}(LL) - 0.06\Delta E_{R,stretch}(RR) - 1.02E_{LR,vdW}(LL) + 0.82$
$\Delta G = -0.85E_{L,vdW}(LL) - 0.04\Delta E_{R,bend}(RR) - 1.15E_{LR,vdW}(LL) + 0.76$
$\Delta G = -0.10E_{LR,vdW} - 0.06\Delta E_{R,stretch}(RR) - 1.06E_{LR,vdW}(LL) - 0.52$
$\Delta G = -0.14E_{L}(LL) - 0.26E_{LR,vdW} - 0.69\Delta E_{L,vdW}(LL) + 0.01$

Four-Descriptor Models

$\Delta G = -0.09E_{L}(LL) - 0.14E_{LR,vdW} - 0.05\Delta E_{R,stretch}(RR) - 0.99E_{LR,vdW}(LL) + 0.08$
$\Delta G = -0.10E_{L}(LL) - 0.15E_{LR,vdW} - 0.04\Delta E_{stretch} - 0.91E_{LR,vdW}(LL) + 0.04$
$\Delta G = -0.42E_{L}(LM) - 0.65E_{L,vdW}(LL) - 0.06\Delta E_{R,stretch}(RR) - 1.01E_{LR,vdW}(LL) - 1.51$
$\Delta G = 0.40\Delta E_{L}(LM) - 0.65E_{L,vdW}(LL) - 0.06\Delta E_{R,stretch}(RR) - 1.01E_{LR,vdW}(LL) - 1.41$
$\Delta G = 0.29\Delta E_{solv} - 0.61E_{L,vdW}(LL) - 0.07\Delta E_{R,stretch}(RR) - 1.10E_{LR,vdW}(LL) - 1.36$

Five-Descriptor Models

$\Delta G = -0.08E_{L}(LL) - 0.12E_{LR,vdW} - 0.06\Delta E_{R,stretch}(RR) - 1.05E_{LR,vdW}(LL) + 0.10\Delta E$[illegible] $- 1.11$
$\Delta G = -0.09E_{L}(LL) - 0.12E_{LR,vdW} - 0.06\Delta E$[illegible]$(RR) - 0.99E_{LR,vdW}(LL) +$ [illegible] $- 1.05$
$\Delta G = -0.08E_{L}(LL) - 0.13E_{LR,vdW} - 0.05\Delta E$[illegible]$(RR) - 0.99E_{LR,vdW}(LL) + 0.22$[illegible]$(LM) - 1.13$
$\Delta G = -0.09E_{L}(LL) - 0.13E_{LR,vdW} - 0.04\Delta E_{R,stretch}(RR) - 1.03E_{LR,vdW}(LL)$ [sign illegible] $0.19$[illegible]$_{stretch}(LL) - 0.001$
$\Delta G = -0.08E_{L}(LL) - 0.13E_{LR,vdW} - 0.08\Delta E_{R,stretch}(RR) - 1.01E_{LR,vdW}(LL) + 0.03\Delta E_{stretch} + 0.21$

a Correlation coefficient. b Cross-validated $r^2$ (leave-one-out method (ref)). c F-statistic. d LOF is the lack-of-fit measure (ref).

Table 10. Molecular Modeling Approximations, Their Impact on Modeling, and the Approaches Used in This Study To Minimize
Approximation Impact

1. FEFF representation

   a. approximation: solvation energies estimated using a hydration shell model
      impact on modeling: incorrect balance between solvation energy and the rest of the FEFF during MDSs
      approach to minimize impact: consider only the solvation energies for low-energy states, or the conformer state used to construct the QSAR

   b. approximation: explicit water molecules are not included
      impact on modeling: hydrogen bonding of ligand through water molecules (water bridges) to active site residues
      approach to minimize impact: examine outliers within context of missing explicit water interactions

   c. approximation: entropic contributions are considered to be constant for the inhibitor analogue series
      impact on modeling: neglect of conformational flexibility on binding
      approach to minimize impact: entropy contributions can be estimated, if necessary, by a group additive model and scaled with respect to temperature in the FEFF 3D-QSAR fitting procedure

2. approximation: the LR, L, and R are modeled as being neutral
   impact on modeling: multiple protonation states are possible and could influence electrostatic energetics
   approach to minimize impact: the protonation state is held constant for the entire training set, so error should be "constant" over the training set; a neutral state approximates solvation and counterion effects on FEFF interactions

3. approximation: scaled down receptor model
   impact on modeling: the scaled down receptor geometry can deviate from the crystal geometries over a long MDS and some RR and LR interactions are eliminated
   approach to minimize impact: heavy masses assigned to each of the atoms of the scaled down model to model missing momentum reservoir of the rest of the enzyme

4. approximation: MDS temperature
   impact on modeling: balance the enthalpy and entropic contributions to $\Delta G$
   approach to minimize impact: the preferred MDS temperature corresponds to the best FEFF 3D-QSAR model of a subset of the training set

5. approximation: sampling schemes used to explore the geometry-energy states of the LR, R, and L
   impact on modeling: the sampling schemes may be incomplete with respect to sampling bound and unbound ligand conformations and to monitoring the change in geometry of the receptor for the bound and unbound states
   approach to minimize impact: use experimental data for bound ligand alignment and ligand-receptor geometry for defining the bound and unbound ligand reference states


adequately sample the states of the system and build 
up a meaningful equilibrium profile. In essence, such 
explicit water MDSs on inhibitor-enzyme systems are 
not practical, and the incomplete ensemble sampling 
associated with such attempted simulations very likely 
yields large errors in the estimation of the solvation 
energetics. 

The inhibitors are relatively small, rigid, and highly analogous
to one another, so that changes in intramolecular ligand
conformational entropy in the binding processes are small and
can be neglected in the FEFF. Likewise, changes in the receptor
geometry appear to be small from the X-ray structures of free
and bound enzymes. Thus, entropy changes due to the receptor
geometry have not been considered in the development of the FEFF
3D-QSAR models. Only the solvation entropy inherent to the
hydration shell solvation model is included in the development
of the QSAR models.



Van der Waals and stretching energy terms dominate as the key
descriptors (binding terms) in both the whole training set and
the "outlier-free" best FEFF 3D-QSAR models. This finding
suggests that the steric fitting of the inhibitor into the
receptor active site is the critical feature that distinguishes
binding among the inhibitors in the training set. Moreover, the
steric fitting and corresponding van der Waals and valence
geometry energies may, in fact, govern the binding
thermodynamics of this system. However, this conclusion cannot
be made from the FEFF 3D-QSARs, since these models only
distinguish binding behavior on a relative basis among the
inhibitors.

The regression coefficients of the van der Waals and stretch
energy terms in the best FEFF 3D-QSAR models, see Tables 5 and
9, are all negative. This general form of the regression
equations supports a steric fitting model for explaining
enzyme-inhibitor binding. $E_{LR,vdW}$, $E_{LR,vdW}(LL)$, and
$E_{L}(LL)$ are increasingly negative as each of these
interactions becomes increasingly stabilizing. Thus, the better
(more stabilizing) each of these interactions, the more positive
(better binding) is its respective contribution to $\Delta G$.
Conversely, $\Delta E_{R,stretch}$, $\Delta E_{stretch}$, and
$\Delta E_{L,stretch}$ are increasingly positive as the valence
bond geometry is increasingly distorted which, in turn, produces
a decrease in $\Delta G$. The less the distortions in valence
bond geometry of the ligand and receptor, the better the
binding.

Electrostatics, hydrogen bonding, and solvation energetics show
up in some of the four-descriptor FEFF 3D-QSAR models in Tables
5 and 9. Nevertheless, these interaction energy terms are minor
descriptors relative to the van der Waals and bond stretching
energies. It would seem that the hydrogen bond water bridges
between the enzyme and some inhibitors, as observed by X-ray,
are the only significant nonsteric binding interaction.

The GFA-MLR analysis provides a family of QSAR models for
understanding how the FEFF terms account for the variance in the
binding affinities of the glucose inhibitor analogue series. The
dispersion energy terms, including the intramolecular vacuum
energy of the unbound ligand, the intermolecular van der Waals
interaction energy, and the van der Waals energy of the bound
ligand, account for 77% ($r^2$ = 0.77) of the variance in the
biological activity over the training set. The inclusion of the
FEFF solvation terms, as seen in Table 9, provides additional
explanation of the variance in inhibitory activity. Thus, for
the training set of glucose analogue inhibitors of GPb, the
solvation and dispersion energetics provide a basis for
understanding the variance in the binding affinities.

Reality is stretched by the finding that the absolute total
energy of the ligand in the unbound state, $E_{L}(LL)$, is a
major descriptor term in the FEFF 3D-QSAR models. The
explanation of this finding is uncovered by monitoring the
behavior of the three related descriptors, $E_{L}(LL)$, the
total absolute bound ligand energy, $E_{LR}(LL)$, and their
difference, $\Delta E_{LR}(LL)$. During the GFA-MLR optimization
all three descriptors survive for a while, but
$\Delta E_{LR}(LL)$ is the first of the three to be lost in the
latter phases of model optimization. A difference term of the
form $aE_{LR}(LL) - bE_{L}(LL)$ survives in the models until
very near the end of the optimization process, when $E_{LR}(LL)$
is finally lost. Thus, $E_{L}(LL)$ is "statistically"
representing the difference in the total bound and unbound
ligand energies, which does make physical sense in a
ligand-receptor binding process. We stress that an investigator
should not pursue increasing inhibitory binding of new ligands
by designing the ligands to be of low absolute intramolecular
energy, $E_{L}(LL)$, through, for example, intramolecular ligand
hydrogen bonding. Rather, the investigator should pursue
enhancing inhibitory potency of new ligands by minimizing the
difference between bound and unbound ligand conformational
energy.
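The relationship between the surviving two-term combination and the bound-minus-unbound difference can be made explicit with a simple rearrangement. This identity is an illustration added here, not a result stated in the study; it assumes the difference descriptor is defined as bound minus unbound, $\Delta E_{LR}(LL) = E_{LR}(LL) - E_{L}(LL)$, with $a$ and $b$ the fitted regression coefficients.

```latex
\begin{align}
aE_{LR}(LL) - bE_{L}(LL)
  &= a\,[E_{LR}(LL) - E_{L}(LL)] + (a - b)\,E_{L}(LL) \\
  &= a\,\Delta E_{LR}(LL) + (a - b)\,E_{L}(LL)
\end{align}
```

When $a \approx b$ the second term vanishes and the combination reduces to a scaled conformational energy difference, which is the physically meaningful quantity the text recommends minimizing in ligand design.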

An FEFF 3D-QSAR is not a QSAR in the "classic" 
sense. It is a force field that can be used to predict 
ligand-receptor binding thermodynamics. The predic- 
tive use of this force field is in doing virtual screening 
of new hypothetical ligands. This application of the 
FEFF 3D-QSAR models will be one of the topics of a
paper which is in preparation. However, investigators 
will, and probably should, inspect an FEFF 3D-QSAR 
for conceptual insight into the feature/descriptor 
requirements for activity. Still, investigators should not 
be surprised, nor disappointed, if the descriptors do not 
readily admit to a level of interpretation that allows 
specific compounds to be identified for synthesis. The 
dispersion energy term in the FEFF 3D-QSAR models 
derived in this work is an example of a descriptor that 
cannot be readily interpreted in terms of specific 
candidate compounds to synthesize but, nevertheless, 
is a well-defined property of a molecule and molecular 
systems. 

The FEFF 3D-QSAR models developed in this study, when compared
with FEFF binding models developed for other inhibitor-enzyme
systems, suggest the FEFF methodology has discriminating
capabilities to dissect the binding thermodynamic behavior of
ligand-receptor systems. In the renin-peptidomimetic inhibitor
study of Tokarski and Hopfinger,⁵ the intrinsic stability of the
ligand, $E_{L}(LL)$, and $E_{LR,vdW}$ are found to be major
descriptors in the $\Delta G$ FEFF 3D-QSAR models. However, the
change in solvation free energy upon binding as well as the
changes in the binding entropies of both the ligand and the
receptor are also significant descriptors in the best FEFF
3D-QSAR models of that study.

In an unpublished FEFF 3D-QSAR analysis of a set of
acetylcholinesterase inhibitors that are analogues of Aricept
[E2020]²⁷ (used for treatment of Alzheimer's disease), the major
FEFF descriptors were found to be the ligand-receptor
hydrogen-bonding energy and the ligand solvation binding energy.
Overall, as observed from experimental studies on the binding
thermodynamics of ligand-receptor systems,²⁸ different types of
binding interactions are predicted by FEFF 3D-QSAR analysis to
play dominant roles in different ligand-receptor systems.